High-Dimensional Data Cubes


Abstract

This paper introduces an approach to supporting high-dimensional data cubes at interactive query speeds and moderate storage cost. The approach is based on binary(-domain) data cubes that are judiciously partially materialized; the missing information can be quickly reconstructed using statistical or linear programming techniques. This enables new applications such as exploratory data analysis for feature engineering and other fields of data science. Moreover, it removes the need to compromise when building a data cube: all columns we might ever wish to use can be included as dimensions. Our approach also speeds up certain dice, roll-up, and drill-down operations on data cubes with hierarchical dimensions compared to traditional data cubes.
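
As a rough illustration of partial materialization over binary dimensions, the following minimal Python sketch answers a query from the smallest stored cuboid that covers the queried dimensions. It is not the paper's implementation: the function names (`build_cuboid`, `project`, `answer`), the dictionary-based layout, and the toy rows are all assumptions made for this example, and the reconstruction step for uncovered queries is only marked, not implemented.

```python
def build_cuboid(rows, dims):
    """Count rows per bit pattern over the chosen binary dimensions."""
    counts = {}
    for row in rows:
        key = tuple(row[d] for d in dims)
        counts[key] = counts.get(key, 0) + 1
    return counts


def project(cuboid, dims, query_dims):
    """Roll up a cuboid to query_dims by summing out the other bits."""
    idx = [dims.index(d) for d in query_dims]
    out = {}
    for key, count in cuboid.items():
        sub = tuple(key[i] for i in idx)
        out[sub] = out.get(sub, 0) + count
    return out


def answer(materialized, query_dims):
    """Answer from the smallest materialized cuboid covering query_dims.

    Returns None when no stored cuboid covers the query; in that case the
    answer would have to be reconstructed (hypothetically, by statistical
    or linear programming techniques), which this sketch does not do.
    """
    covering = [dims for dims in materialized if set(query_dims) <= set(dims)]
    if not covering:
        return None
    best = min(covering, key=len)
    return project(materialized[best], list(best), query_dims)


# Toy data: three binary dimensions a, b, c (hypothetical).
rows = [
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 1, "c": 1},
    {"a": 0, "b": 0, "c": 1},
]

# Only two of the possible cuboids are stored (partial materialization).
materialized = {
    ("a", "b"): build_cuboid(rows, ["a", "b"]),
    ("b", "c"): build_cuboid(rows, ["b", "c"]),
}

result_b = answer(materialized, ["b"])        # covered by a stored cuboid
result_ac = answer(materialized, ["a", "c"])  # not covered by any stored cuboid
```

Here the query on `b` is served by summing out one bit of a stored two-dimensional cuboid, while the query on `a` and `c` falls into the gap that the abstract says is filled by reconstruction.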


Related Resources

Approximate Query Answering in High-Dimensional Data Cubes

Data mining work has been successful in both compressing and modeling large data sets with many values. However, the problem of high-dimensions has not been sufficiently addressed. In our work, we develop a new data reduction method that aims to speed subsequent data analysis by efficiently constructing a high-dimensional, joint probability distribution. This distribution summarizes the data by...


Maximal inequality for high-dimensional cubes

We present lower estimates for the best constant appearing in the weak (1, 1) maximal inequality in the space (R^n, ‖ · ‖∞). We show that this constant grows to infinity faster than (log n)^(1−o(1)) when n tends to infinity. To this end, we follow and simplify the approach used by J.M. Aldaz. The new part of the argument relies on Donsker’s theorem identifying the Brownian bridge as the limit object ...


Methods for regression analysis in high-dimensional data

As science, knowledge, and technology have evolved, new and precise methods for measuring, collecting, and recording information have been developed, leading to the emergence and growth of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...


High Performance Data Mining Using Data Cubes on Parallel Computers

On-Line Analytical Processing techniques are used for data analysis and decision support systems. The multidimensionality of the underlying data is well represented by multidimensional databases. For data mining in knowledge discovery, OLAP calculations can be effectively used. For these, high performance parallel systems are required to provide interactive analysis. Precomputed aggregate calcu...


Mining Multi-Dimensional Constrained Gradients in Data Cubes

Constrained gradient analysis (similar to the “cubegrade” problem posed by Imielinski, et al. [9]) is to extract pairs of similar cell characteristics associated with big changes in measure in a data cube. Cells are considered similar if they are related by roll-up, drill-down, or 1-dimensional mutation operation. Constrained gradient queries are expressive, capable of capturing trends in data ...



Journal

Journal title: Proceedings of the VLDB Endowment

Year: 2022

ISSN: 2150-8097

DOI: https://doi.org/10.14778/3565838.3565839